Current Issue: July - September 2014, Issue 3 (4 Articles)
Data-intensive problems are now so prevalent that organizations across many industries must confront them in their business operations. It is often crucial for enterprises to be able to analyze large volumes of data in an effective and timely manner. MapReduce and its open-source implementation Hadoop dramatically simplified the development of parallel data-intensive computing applications for ordinary users, and the combination of Hadoop and cloud computing has made large-scale parallel data-intensive computing far more accessible to potential users than ever before. Although Hadoop has become the most popular data management framework for parallel data-intensive computing in the cloud, the Hadoop scheduler is not a perfect match for cloud environments. In this paper, we discuss the issues with the Hadoop task assignment scheme and present an improved scheme for heterogeneous computing environments such as public clouds. The proposed scheme is based on an optimal minimum makespan algorithm. It projects and compares the completion times of each task slot's next data block, and explicitly strives to shorten the completion time of the map phase of MapReduce jobs. We conducted extensive simulations to evaluate the performance of the proposed scheme against the Hadoop scheme in two types of heterogeneous computing environments that are typical of public cloud platforms. The simulation results show that the proposed scheme markedly reduces the map phase completion time, and that it reduces the amount of remote processing to an even greater extent, making data processing less vulnerable to both network congestion and disk contention.
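To make the assignment idea concrete, the following is a minimal sketch of a greedy minimum-makespan assignment of the kind the abstract describes: each data block goes to the task slot whose projected completion time for that block is earliest, with a penalty for remote (non-local) processing. All names (Slot, projected_finish, the cost fields) and numbers are illustrative assumptions, not the paper's actual implementation.

```python
from dataclasses import dataclass

@dataclass
class Slot:
    node: str           # node hosting this map slot
    ready_time: float   # when the slot's current work is projected to finish
    local_cost: float   # projected seconds to process a locally stored block
    remote_cost: float  # projected seconds to fetch and process a remote block

def projected_finish(slot: Slot, replicas: set) -> float:
    """Completion time if this slot takes the block next; remote blocks
    pay a transfer penalty, which the scheme tries to avoid."""
    cost = slot.local_cost if slot.node in replicas else slot.remote_cost
    return slot.ready_time + cost

def assign(slots: list, blocks: list) -> list:
    """Greedily give each block to the slot that would finish it earliest,
    shortening the projected makespan of the map phase."""
    plan = []
    for replicas in blocks:  # replicas: set of nodes holding the block
        best = min(slots, key=lambda s: projected_finish(s, replicas))
        finish = projected_finish(best, replicas)
        plan.append((best.node, finish))
        best.ready_time = finish  # the slot is busy until then
    return plan

# Example: a fast node and a slow node; three blocks with given replica sets.
slots = [Slot("fast", 0.0, 2.0, 3.0), Slot("slow", 0.0, 5.0, 6.5)]
print(assign(slots, [{"fast"}, {"slow"}, {"slow"}]))
# -> [('fast', 2.0), ('fast', 5.0), ('slow', 5.0)]
```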
Cloud technology has the potential to widen access to high-performance computational resources for e-science research, but barriers to engagement with the technology remain high for many scientists. Workflows help overcome these barriers by hiding the details of the underlying computational infrastructure; they are portable between platforms, including clouds, and are increasingly accepted within e-science research communities. Issues arising from the range of available workflow systems and the complexity of workflow development have been addressed by focusing on workflow interoperability and by providing customised support for different science communities. However, deploying such environments can be challenging, even where user requirements are comparatively modest. RESWO (Reconfigurable Environment Service for Workflow Orchestration) is a virtual platform-as-a-service cloud model that allows leaner customised environments to be assembled and deployed within a cloud. Suitable distributed computation resources are not always easily affordable and can present a further barrier to engagement by scientists. Desktop grids, which use the spare CPU cycles available within an organisation, are an attractively inexpensive type of infrastructure for many and have been effectively virtualised as a cloud-based resource. However, hosts in this environment are volatile, leading to the 'tail' problem, in which some tasks become randomly delayed and degrade overall performance. To solve this problem, new algorithms have been developed to implement a cloudbursting scheduler in which durable cloud-based CPU resources execute replicas of jobs that have become delayed. This paper describes experiences in developing a RESWO instance in which a desktop grid is buttressed with CPU resources in the cloud to support the aspirations of bioscience researchers. A core component of the architecture, the cloudbursting scheduler, implements an algorithm to perform late job detection, cloud resource management, and job monitoring. The experimental results obtained demonstrate significant performance improvements and benefits, illustrated by use cases in bioscience research.
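As a rough illustration of the late-job detection and replication logic described above, the sketch below flags grid jobs whose elapsed time exceeds a multiple of the median completed runtime and replicates them onto cloud resources. The threshold, names, and callback are illustrative assumptions, not the RESWO scheduler's actual algorithm.

```python
import statistics

LATE_FACTOR = 2.0  # illustrative: "late" = 2x the median completed runtime

def find_late_jobs(running: dict, completed_runtimes: list, now: float) -> list:
    """running maps job_id -> start_time; returns ids of jobs in the 'tail'."""
    if not completed_runtimes:
        return []
    threshold = LATE_FACTOR * statistics.median(completed_runtimes)
    return [jid for jid, start in running.items() if now - start > threshold]

def cloudburst(running, completed_runtimes, now, replicated, launch_in_cloud):
    """Replicate each newly detected late job onto a durable cloud node;
    whichever copy (grid or cloud) finishes first wins."""
    for jid in find_late_jobs(running, completed_runtimes, now):
        if jid not in replicated:
            launch_in_cloud(jid)  # acquire a cloud CPU and start the replica
            replicated.add(jid)

# Example: one job started long ago is flagged and burst to the cloud.
running = {"blast-17": 100.0, "blast-18": 600.0}
done = [120.0, 130.0, 125.0]  # median 125 -> threshold 250 seconds
cloudburst(running, done, now=760.0, replicated=set(),
           launch_in_cloud=lambda jid: print("bursting", jid))
# prints: bursting blast-17
```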
Advances in Web technology and the proliferation of mobile devices and sensors connected to the Internet have resulted in immense processing and storage requirements. Cloud computing has emerged as a paradigm that promises to meet these requirements. This work focuses on the storage aspect of cloud computing, specifically on data management in cloud environments. Traditional relational databases were designed in a different hardware and software era and face challenges in meeting the performance and scale requirements of Big Data. NoSQL and NewSQL data stores present themselves as alternatives that can handle huge volumes of data. Because of the large number and diversity of existing NoSQL and NewSQL solutions, it is difficult to comprehend the domain and even more challenging to choose an appropriate solution for a specific task. Therefore, this paper reviews NoSQL and NewSQL solutions with the objectives of (1) providing a perspective on the field, (2) providing guidance to practitioners and researchers in choosing an appropriate data store, and (3) identifying challenges and opportunities in the field. Specifically, the most prominent solutions are compared with a focus on data models, querying, scaling, and security-related capabilities. Features driving the ability to scale read requests, write requests, and data storage are investigated, in particular partitioning, replication, consistency, and concurrency control. Furthermore, use cases and scenarios in which NoSQL and NewSQL data stores have been used are discussed, and the suitability of various solutions for different sets of applications is examined. This study also identifies challenges in the field, including the immense diversity and inconsistency of terminologies, limited documentation, sparse comparison and benchmarking criteria, and the absence of standardized query languages.
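Partitioning is one of the scaling features the survey compares; a widely used approach in NoSQL stores is consistent hashing, where keys map to the first virtual node clockwise from their hash, so adding or removing a node relocates only a small fraction of the keys. The sketch below is a minimal, generic illustration with hypothetical names, not the implementation of any particular data store surveyed.

```python
import bisect
import hashlib

def _hash(key: str) -> int:
    """Stable hash onto the ring (md5 used only for determinism)."""
    return int(hashlib.md5(key.encode()).hexdigest(), 16)

class ConsistentHashRing:
    """Minimal consistent-hash ring; virtual nodes smooth the key
    distribution across physical nodes."""
    def __init__(self, nodes, vnodes: int = 64):
        self.ring = sorted(
            (_hash(f"{node}#{i}"), node)
            for node in nodes for i in range(vnodes)
        )
        self._keys = [h for h, _ in self.ring]

    def node_for(self, key: str) -> str:
        idx = bisect.bisect(self._keys, _hash(key)) % len(self.ring)
        return self.ring[idx][1]

ring = ConsistentHashRing(["node-a", "node-b", "node-c"])
print(ring.node_for("user:42"))  # deterministic placement of a key
```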
As federal funding for many public non-profit organizations (NPOs) dwindles, it is of the utmost importance to focus efforts on reducing the operating costs of needy organizations such as public schools. Our approach to reducing organizational costs combines the benefits of a high-performance cloud architecture with low-power, thin-client devices. However, general-purpose private cloud architectures are not easily deployable by average users, or even by those with some computing knowledge. For this reason, we propose a new vertical cloud architecture focused on ease of deployment and management, which provides organizations with cost-efficient virtualization and storage as well as other organization-specific utilities. We postulate that providing organizations with cost-efficient, on-demand access to electronic resources can reduce operating costs while improving the user experience and organizational efficiency. In this paper we discuss our private vertical cloud architecture, called THUNDER. Additionally, we introduce a number of methodologies that could enable needy non-profit organizations to decrease costs while providing many additional benefits for their users. Specifically, this paper introduces our current implementation of THUNDER, details of the architecture, and the software system that we have designed to specifically target the needs of underfunded organizations.